Abstract

We consider the concept of mutual information in ecological networks, and use this idea to analyse 
the Tangled Nature model of co-evolution. We show that this measure of correlation has two distinct 
behaviours depending on how we define the network in question: if we consider only the network of 
viable species this measure increases, whereas for the whole system it decreases. It is suggested that 
these are complementary behaviours that show how ecosystems can become both more stable and better
adapted. 



1 Motivation 

Identifying universal features of ecosystem dynamics has been a long-standing goal in ecology.
These attempts have usually involved identifying system variables that are potentially optimised
during the evolution of an ecosystem. Many such candidate variables have been identified.
Increasingly the focus has been on the network properties of the ecosystem, or more precisely
the trophic net defined by the mass flows between the species constituting the ecosystem.
However, empirical evidence at the resolution needed to verify any particular claim remains out
of reach for most studies. For ecologists these quantities are of both theoretical and practical
interest. From a theoretical point of view it would be valuable, as already noted, to find some
governing principle of ecological dynamics, while practically speaking there is a need to
establish a good measure of ecosystem health and maturity [1] [2].

In this paper we propose to study this issue in the context of a well established evolutionary
model. The Tangled Nature model of co-evolution [3] has already been studied in several contexts
[4] [5] [6] and is ideal for this work as it is designed specifically to study long time
behaviour in ecological networks. Its simplicity, along with the rich complexity of its resulting
behaviour, makes it a paradigmatic model for testing co-evolutionary ideas. The model retains the
binary string genotype geometry found in previous approaches (for example the quasispecies model
[7] or the NK model [8]), but replaces their 'ad hoc' static fitness landscapes with a set of
population dependent interactions between extant species, similar to the 'tangled' interactions
of an ecosystem. From a 'random' initial state, the network of extant and interacting populations
changes over time, slowly but radically, enabling the system to support an ever growing number of
individuals.

Despite its simplicity, the model is able to reproduce several empirical observations: the long
time decrease reported in the overall macroscopic extinction rate; the observed intermittent
nature of macro-evolution, denoted punctuated equilibrium by Gould and Eldredge; the log-normal
shape often observed for Species Abundance Distributions; and the power law relation often seen
between area and the number of different species. The framework of the model is also able to
reproduce the often reported exponential degree distributions of the network of species, as well
as the decreasing connectance with increasing species diversity that has attracted much
observational and theoretical interest.

The model is described in greater detail below, but the key aspect of its behaviour is that it
moves through a series of different network configurations. In this paper we analyse these
dynamic networks using tools developed in ecology. In particular, we are able to shed light on
the tension between robustness and efficiency in ecological networks highlighted by Jorgensen
et al [9]. Increased correlation leads to greater brittleness in the case of perturbations, but
greater robustness leads to an apparent squandering of resources. We suggest how this conflict
can be resolved using evidence from Tangled Nature, where it is possible to divide the system
into two interacting parts - a viable network of keystone species, and a periphery of unviable
mutants. Seen from this perspective the apparent paradox is resolved, as the viable network
becomes increasingly correlated, while the total network (including many species in potentia)
develops greater redundancy.

2 Review of the basic behaviour of the model

2.1 Type space and the interaction matrix

A type is represented by a vector $S$ of $L$ elements belonging to the set $\{0, 1\}$. Thus there
are $2^L$ possible types, corresponding to the vertices of a unit hypercube in $L$ dimensions.
$S$ may be interpreted as a genome, or a set of characteristics - either way it is directly
susceptible to mutations and defines the type completely (that is, there is no phenotype level
in this model). Each type, which we can index by a number $i$ in the range 1 to $2^L$ to
simplify notation, has a population of $n_i(t)$ identical individuals, so the total population
is the sum over all the $2^L$ possible types

$$N(t) = \sum_{i=1}^{2^L} n_i(t) \qquad (2.1)$$
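The type space and equation (2.1) can be illustrated with a minimal Python sketch. The function names, the small genome length and the occupancy values are illustrative assumptions rather than part of the model definition, and types are indexed from 0 here instead of from 1.

```python
# Illustrative sketch only: types are stored as L-bit integers (indexed from 0).
L = 4  # small genome length for illustration; the model in this paper uses L = 20


def type_vector(index, length=L):
    """Return the binary vector S for a type index in the range 0 .. 2**length - 1."""
    return [(index >> bit) & 1 for bit in reversed(range(length))]


def hamming_distance(i, j):
    """Number of genome positions at which types i and j differ."""
    return bin(i ^ j).count("1")


def total_population(occupancy):
    """N(t): sum of n_i(t); types missing from the dict have zero population."""
    return sum(occupancy.values())


# Hypothetical occupancies n_i(t) for three occupied types.
occupancy = {0b0000: 600, 0b0001: 300, 0b1001: 100}

print(type_vector(0b1001))               # [1, 0, 0, 1]
print(hamming_distance(0b0000, 0b1001))  # 2
print(total_population(occupancy))       # 1000
```

Storing only occupied types in a dictionary keeps the sum in (2.1) cheap even when $2^L$ is large, since unoccupied types contribute nothing.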

The ability of an individual to reproduce is determined by how it interacts with the other types
present at a given time. This is formalised in the reproduction weight function $H(S_i, t)$
(which is then turned into a probability of reproducing - see below)

where the sum is over all other types, $C$ is a control parameter that determines the level of
inhomogeneity in the population, $N(t)$ is the total population at time $t$, and $n(S, t)$ is
the population of type $S$.

Two types $S_i$ and $S_j$ are coupled via the interaction matrix $J(S_i, S_j)$, which can be
either positive, negative or zero. This number is intended to be the sum of all the influences
of $i$ upon $j$. This interaction matrix is unrelated to the type space outlined above, so there
are no correlations in the interactions between different types - that is,
$\langle J(S_i, S_j) J(S_k, S_j) \rangle = 0$ even if the average is restricted to neighbours in
type space. This interaction is not necessarily material in nature but may represent any
influence that one type has on another. The overall connectivity of the interaction matrix is
set by a parameter which for this paper has a value of 0.2 (that is, 0.2 of all possible
connections between types actually exist). The distribution of the nonzero values of the
function $J(S_i, S_j)$ is irrelevant as long as they are distributed in some reasonable,
continuous way. The interaction matrix is constructed such that if $J(S_i, S_j)$ is nonzero then
$J(S_j, S_i)$ is also nonzero. This means there are three types of interaction - mutualistic,
antagonistic and predator-prey. Figure 1 illustrates the key components of the Tangled Nature
model - the hypercubic type space, varying type occupancies, and the different types of possible
interaction between types.
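A small sketch of how such an interaction matrix could be generated and classified is given below. It is only an illustration: the uniform distribution on (-1, 1), the function names and the dictionary storage are assumptions, chosen because the text requires only that nonzero values be drawn in some reasonable, continuous way.

```python
import random


def nonzero_uniform(rng):
    """Draw from a uniform distribution on [-1, 1], rejecting an exact zero."""
    value = 0.0
    while value == 0.0:
        value = rng.uniform(-1.0, 1.0)
    return value


def build_interactions(num_types, connectivity=0.2, seed=0):
    """Return a dict {(i, j): J_ij} in which J_ij != 0 whenever J_ji != 0."""
    rng = random.Random(seed)
    J = {}
    for i in range(num_types):
        for j in range(i + 1, num_types):
            if rng.random() < connectivity:
                # The two directions are drawn independently of each other.
                J[(i, j)] = nonzero_uniform(rng)
                J[(j, i)] = nonzero_uniform(rng)
    return J


def interaction_kind(J, i, j):
    """Classify the pair (i, j) by the signs of J_ij and J_ji."""
    a, b = J.get((i, j), 0.0), J.get((j, i), 0.0)
    if a == 0.0 and b == 0.0:
        return "none"
    if a > 0 and b > 0:
        return "mutualistic"
    if a < 0 and b < 0:
        return "antagonistic"
    return "predator-prey"


J = build_interactions(16)  # 16 types, as in the 4-dimensional illustration
kinds = {interaction_kind(J, i, j) for (i, j) in J}
```

For the full 20-dimensional type space an explicit table of all pairs is far too large to store, so a sketch like this is only practical for small type spaces.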




Figure 1: An example of the configuration of the Tangled Nature system in a meta-stable state.
This is a 4 dimensional model for expository purposes only; the model in this paper has 20
dimensions. The vertices of the hypercube represent the 16 possible types in the model. The
dotted lines represent nearest neighbour links in type space, and the solid lines represent
non-zero interaction terms with blue = +-, red = --, green = ++.



2.2 Reproduction, mutations and death

The model is simulated stochastically, with a time-step consisting of the following: one
individual is selected at random, and reproduces asexually according to the probability

$$P_r(S_i, t) = \frac{\exp[H(S_i, t)]}{1 + \exp[H(S_i, t)]} \in [0, 1] \qquad (2.3)$$

If successful the individual is replaced with two copies. In each of these copies there is a
probability of mutation per 'gene', $p_{mut}$. Another individual is picked at random and is
killed with probability $p_{kill}$.
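The update described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the weight function `weight(genome, population)` standing in for $H$ is supplied by the caller, and the list-of-genomes representation and function names are hypothetical choices, not the authors' implementation.

```python
import math
import random


def reproduction_probability(H):
    """Logistic map from the weight H to a probability in [0, 1], as in (2.3)."""
    if H >= 0:
        return 1.0 / (1.0 + math.exp(-H))  # algebraically equal, avoids overflow
    e = math.exp(H)
    return e / (1.0 + e)


def mutate(genome, p_mut, rng):
    """Copy a genome, flipping each 'gene' independently with probability p_mut."""
    return tuple(bit ^ 1 if rng.random() < p_mut else bit for bit in genome)


def time_step(population, weight, p_mut, p_kill, rng):
    """One update: a reproduction attempt, then a killing attempt.

    population is a list of genomes (tuples of 0/1), modified in place;
    weight(genome, population) returns H for that genome (caller-supplied).
    """
    if not population:
        return
    parent_index = rng.randrange(len(population))
    parent = population[parent_index]
    if rng.random() < reproduction_probability(weight(parent, population)):
        # The parent is replaced by two (possibly mutated) copies.
        population.pop(parent_index)
        population.append(mutate(parent, p_mut, rng))
        population.append(mutate(parent, p_mut, rng))
    if population:
        victim_index = rng.randrange(len(population))
        if rng.random() < p_kill:
            population.pop(victim_index)
```

With a strongly negative weight the reproduction probability is effectively zero, so only the killing step acts, which matches the behaviour expected when no interactions are present.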

2.3 General behaviour of the model 

We start a run with $N(0) = 1000$ individuals on one randomly chosen site. Initially there is no
reproduction, since there can be no interactions between species, so $H$ is very negative and
the probability of reproduction is effectively zero. Then as the resource limitation term
diminishes, reproduction becomes possible, and consequently some new types are generated by
mutations. Once interactions between these new types begin, the interaction term in the
reproduction probability becomes significant. After some re-organisation, a set of species that
interact in a stable way emerges, and persists for some time (see figure 2). This period of
stability is ended by another chaotic reorganisation, from which another meta-stable state
emerges.

The bulk properties of these meta-stable states 
turn out to depend on the age of the system - the 
system slowly optimises the interactions between 
species, as evidenced for example by the logarithmically
increasing population (figure 3). It is this
non-stationary aspect of the model that this paper 
tries to explain, albeit only partially. 



3 Results 

We ran 1500 simulations of the model with an ini- 
tial population confined to one randomly chosen 
site. The random interaction matrix was regen- 
erated each time. The parameters used for all the 
runs were the same, and were chosen to robustly 
generate the intermittent regime for a population 
of a manageable size. 

We use the following parameter values: $\mu = 0.14$, $p_{mut} = 0.03$, $p_{kill} = 0.2$,
$c = 10$. Detailed discussion of the various regimes defined by these parameters can be found
elsewhere; for now we simply note that the behaviour generated by this set is characteristic of
a significant area of parameter space. The one major change is seen when $p_{mut}$ goes above
the error threshold, which results in diffusion dominated behaviour.

Figure 2: Overview of a typical run of the Tangled Nature model (TaNa). The y-axis is simply a
species label, ranging from 1 to $2^L$, and the x-axis is time in generations. If a position is
occupied at a given time, a dot is placed at the corresponding number for that time step. The
plot clearly shows the alternating stable and unstable periods. The stable periods are
characterised by a steady population and constant set of species, whereas the transitions have a
constantly changing set of species (e.g. between 100 000 and 150 000 generations). Figure from
[3].

Figure 3: The mean population (averaged over an ensemble of 1000 runs) increases logarithmically
in time.

3.1 The Core and the Periphery

The network realised at any given time can be divided into two classes - those nodes that are
viable (loosely, those that have a birth rate approximately equal to the death rate) and those
that are not. This second group consists of the mutants from the viable core, which in the
current configuration are not able to reproduce. Figure 4 schematically depicts this
arrangement, with each viable species having a flower of unviable mutants surrounding it. These
mutants do not, in general, play an active role (even as a stabilising factor) during a stable
period, but they are in the end responsible for the eventual collapse of one metastable state
and the creation of another. The following results are obtained for both the whole system and
the viable core.




Figure 4: A typical stable ecological network seen in the Tangled Nature model. Each circle
represents a species with the size proportional to the current population. Lines represent
interaction links ($|J| > 0$) between species. The mutation network has been suppressed to
clarify the figure but the line indices give the Hamming distance (a measure of evolutionary
separation) between each species. The core is the network of large nodes, and the whole system
includes the smaller unlabelled nodes. Figure from [3].



3.2 Mutual Information 

Ideas from information theory have been used in ecology for over 50 years [10] [11], and
Rutledge et al introduced the idea of using the mutual information of networks as a measure of
their stability [12]. This went largely unnoticed by those working more recently on networks in
graph theory and complexity. This is principally due to the fact that ecologists must work with
weighted networks, whereas most recent work on network characterisation has focussed on
unweighted networks, for which there exists a large arsenal of analytical tools.

First we define what the mutual information is 
for a general random process, then we will define 
how we use this measure in this paper. The infor- 
mation of a realisation x of a random variable X is 
defined via its probability distribution P(x), as 

$$I(x) = P(x) \log P(x) \qquad (3.1)$$

For two random variables $X$ and $Y$, the mutual information is the reduction in the
uncertainty of $X$ given knowledge of $Y$. It is defined as

$$I(X;Y) = \sum_{x,y} P(x,y) \log\left(\frac{P(x,y)}{P_1(x)\,P_2(y)}\right) \qquad (3.2)$$

where $P_1$ and $P_2$ are the marginal distributions of $X$ and $Y$ respectively, and $P$ is
the joint probability distribution. Equally we can think of the mutual information as the
constraint imposed on $X$ by $Y$.
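As a concrete illustration of this definition, the following Python sketch computes the mutual information of a discrete joint distribution. The function name, the dictionary representation and the use of the natural logarithm are assumptions made for the example; the text does not fix the base of the logarithm.

```python
import math


def mutual_information(joint):
    """Mutual information of a joint distribution given as {(x, y): P(x, y)}.

    The marginals P1(x) and P2(y) are obtained by summing the joint table.
    """
    p1, p2 = {}, {}
    for (x, y), p in joint.items():
        p1[x] = p1.get(x, 0.0) + p
        p2[y] = p2.get(y, 0.0) + p
    return sum(
        p * math.log(p / (p1[x] * p2[y]))
        for (x, y), p in joint.items()
        if p > 0  # terms with zero probability contribute nothing
    )


# Two independent fair bits: knowing Y says nothing about X.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
# Two perfectly correlated bits: knowing Y determines X.
correlated = {(0, 0): 0.5, (1, 1): 0.5}

print(mutual_information(independent))  # 0.0
print(mutual_information(correlated))   # log 2, about 0.693
```

The two test cases bracket the behaviour of interest: the measure vanishes when the joint distribution factorises and grows as one variable constrains the other.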

The Tangled Nature model is a model of network evolution. As the structure of the network
changes, we ask the question: how does the current network structure constrain its evolution?
The network we consider is the interaction network $J$ weighted by the occupancy of the
species, so that we only consider connections between extant species. When this condition is
met, we consider there to be $n_i n_j$ copies of link $J_{ij}$. Consider the ensemble link value
distribution at time $t$, $P(J, t)$. This gives the probability of a link value $J$ for an
ensemble of realisations. However, if we consider a particular realisation, we can expect that
this distribution, $P(J, t, r)$ (where $r$ indexes specific realisations), will in general
differ from the ensemble average. We can measure this difference by looking at the joint
probability distribution $P(J_1, J_2, t, r)$. The degree to which this quantity differs from the
product of the marginal distributions for $J_1$ and $J_2$ (which in our case are identical,
equal to the distribution over the ensemble $P(J, t)$) measures the degree to which the presence
of some link value $J_1$ influences the presence of some other value $J_2$.

To consider the probability of a link value appearing at time $t$, we first introduce a new
variable which will simplify what follows. We consider a single index $k$ that runs over all
links in a realisation, and each link is weighted by $d_k$, the product of the occupancies of
the two nodes at either end: $d_k = d(J_{ij}) = n_i n_j$. Explicitly we define the relevant
quantities as follows: the probability that the link value $J$ appears at time $t$ is

$$P(J, t) = \frac{1}{RD} \sum_{r} \sum_{k} d_k\, \delta(J - J_k) \qquad (3.3)$$

where $D$ is the number of links counted between all extant species, $D = \sum_{i,j} n_i n_j$,
and $R$ is the number of realisations, whereas the joint probability distribution for two link
values to appear in one realisation is

$$P(J_1, J_2, t, r) \propto \sum_{k,l} (d_k + d_l)\, \delta(J_1 - J_k)\, \delta(J_2 - J_l) \qquad (3.4)$$

With these quantities defined, we may define the mutual information on these distributions as

$$I(t, r) = \sum_{J_1, J_2} P(J_1, J_2, t, r) \log\left(\frac{P(J_1, J_2, t, r)}{P(J_1, t)\, P(J_2, t)}\right) \qquad (3.5)$$

Since the link distribution fluctuates due to the stochastic nature of the system, this
distribution is calculated over a small time window $\delta t$ where $\delta t / t \ll 1$.

(a) The mutual information of the whole network as a function of time. We see a slow but
significant decrease, signifying a decorrelation of the component parts of the network.

(b) The mutual information of the core as a function of time. For this subset the mutual
information increases over time, indicating greater correlation and efficiency.

(c) The mutual information of a random set of networks using the same average diversity and
population as the simulation results. The mutual information has no trend and is approximately
2 orders of magnitude smaller.

Figure 5: Contrasting trends in the mutual information for different subsets. The horizontal
axis of each panel is time ($10^5$ generations).
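The occupancy-weighted link statistics described in this section can be sketched numerically as below. This is a rough illustration under several assumptions that are not taken from the text: link values are grouped into bins of a hypothetical width so that the delta functions become histogram counts, each realisation is normalised by its own total weight, the joint distribution is normalised to sum to one, and all function names and example data are invented for the purpose.

```python
import math
from collections import defaultdict


def bin_index(J, width=0.5):
    """Map a continuous link value to a histogram bin (binning is an assumption)."""
    return math.floor(J / width)


def ensemble_marginal(realisations, width=0.5):
    """Ensemble distribution P(J, t) from a list of realisations.

    Each realisation is a list of (J_k, d_k) pairs with d_k = n_i * n_j.
    """
    hist = defaultdict(float)
    R = len(realisations)
    for links in realisations:
        D = sum(d for _, d in links)
        for J, d in links:
            hist[bin_index(J, width)] += d / (R * D)
    return dict(hist)


def realisation_joint(links, width=0.5):
    """Joint distribution P(J1, J2, t, r) for one realisation, weights d_k + d_l."""
    hist = defaultdict(float)
    for J_k, d_k in links:
        for J_l, d_l in links:
            hist[(bin_index(J_k, width), bin_index(J_l, width))] += d_k + d_l
    total = sum(hist.values())
    return {key: weight / total for key, weight in hist.items()}


def link_mutual_information(links, marginal, width=0.5):
    """Sum of P12 * log(P12 / (P(J1) P(J2))) with the ensemble P as both marginals."""
    joint = realisation_joint(links, width)
    return sum(
        p * math.log(p / (marginal[a] * marginal[b]))
        for (a, b), p in joint.items()
    )


# Two hypothetical realisations, each a list of (link value, weight) pairs.
realisations = [[(0.3, 4), (-0.6, 1)], [(0.3, 2), (0.8, 2)]]
marginal = ensemble_marginal(realisations)
mi_first = link_mutual_information(realisations[0], marginal)
```

Because the realisation being analysed is part of the ensemble, every bin that appears in its joint distribution also has nonzero ensemble probability, so the logarithm is always defined.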



Figures 5(a) and 5(b) show the evolution of the mutual information (MI) over time for two
different subsets of the system. Figure 5(a) is the MI for the whole system, where we see a
declining trend. The subset of links joining nodes with more than 5 individuals by contrast
displays an increase in the MI over time (figure 5(b)).

We note that in general the mutual information is quite low, which is expected. We are
measuring the influence of the presence of link values on the presence of other link values;
this influence is highly constrained by the quenched randomness of the network and the
stochastic dynamics, so in general we do not expect the mutual information to be high.
Nevertheless we have compared the values obtained to simulated random networks of equivalent
size and connectance, and found the mutual information of the random networks to be
approximately three orders of magnitude smaller.

The data is significantly noisy despite being the 
result of a large ensemble average. Nevertheless 
it is clear, especially for the whole system, that the 
curves are approximately linear in logarithmic time. 
This corresponds to the behaviour of other mea- 
sures of the system, and can possibly ultimately be 
related back to some record process. 
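A claim of linearity in logarithmic time can be checked with an ordinary least-squares fit of the measured values against $\ln t$. The sketch below is illustrative only: the function name is hypothetical and the data are synthetic, generated from an exact logarithmic law rather than taken from the simulations.

```python
import math


def fit_log_time(times, values):
    """Least-squares fit of values = a + b * ln(t); returns (a, b)."""
    xs = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic example: a quantity that declines exactly linearly in ln(t).
times = [1e5 * k for k in range(1, 11)]
values = [2.0 - 0.3 * math.log(t) for t in times]
intercept, slope = fit_log_time(times, values)
```

A negative fitted slope corresponds to a declining trend of the kind described for the whole system, and a positive slope to an increasing trend of the kind described for the core.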

Averaging over more realisations increased the 
clarity of the results, but at the cost of computing 
time. To decrease the fluctuations by an order of 
magnitude would have required approximately 400 
weeks more computing time. 
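The steep cost can be understood from the usual scaling of ensemble averages. As a rough sketch, assuming that the realisations are independent and that the fluctuations of the ensemble mean behave like a standard error (assumptions made here for illustration), the noise amplitude $\sigma_R$ after averaging over $R$ realisations satisfies

```latex
\sigma_R \propto \frac{1}{\sqrt{R}}
\quad\Longrightarrow\quad
\frac{\sigma_{R'}}{\sigma_R} = \sqrt{\frac{R}{R'}} = \frac{1}{10}
\quad\Longrightarrow\quad
R' = 100\,R .
```

Under these assumptions a tenfold reduction in the fluctuations requires roughly one hundred times as many realisations, which is why the additional computing time grows so quickly.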

4 Discussion

The question of how the structure of an ecosystem, or any system of interacting, evolving
agents, changes over time is a controversial one, and to some extent depends on the details of
the system under consideration. In this paper we have considered a generic evolutionary model
with the aim of elucidating ecological dynamics in the general case. The apparent competition
between two requirements of a viable ecosystem - that it maximises resource use on the one
hand, and remains robust to perturbations on the other - poses the question: what in fact
happens?

The obvious way to answer this question would be to do an experiment. However, ecological
experiments of the type required (both in terms of detail and time resolution) are not
currently possible. Indeed, ecological data recorded over evolutionarily significant timescales
are practically unattainable for any but the fastest evolving systems, such as microbial
populations (see for example [13]). However, even for such experimentally manipulable systems
it may be hard to infer interaction networks accurately. The practical difficulties of
experiments in evolutionary ecology are one of the key reasons why we believe theoretical work
such as that presented here is important, since it can act both as spur and guide for future
experimental work.

We have found that while the ecosystem as a whole becomes less correlated over time, the
correlation of the network of its core species increases. While we have not shown it here, it
seems plausible that these are two sides of the same coin - decorrelation of the whole system
implies that the system explores a greater range of possible networks, from which it chooses
increasingly well correlated subsets. This fits with other results we have obtained that show
the model increases its population over time.

When considering ecological networks, most work has naturally focussed on trophic networks,
that is, networks of material flow through an ecosystem. This has yielded a natural way to
analyse these networks: since the dynamics is conservative, one can consider the probability of
any two species being involved in material exchange. The Tangled Nature model explicitly models
more than simply mass flow in ecosystems: it attempts to quantify the influence that one
species has on another. While this has the advantage of allowing one to consider more than
simply predator-prey relationships (for example mutualistic behaviour arises very naturally in
the model), it means that one cannot simply take over tools used on trophic nets wholesale. In
this paper we have adapted the approach used in ecology and elsewhere to this interaction view,
with the caveat that our results are not directly comparable to those gleaned from analysis of
food webs. We did also attempt to interpret the model as a flow model but found that this
approach yielded no clear information about the network structure. One possibility in this
direction is to adopt the approach in [14], where once a network has evolved one imagines some
simple Markovian dynamics entirely independent of the actual model dynamics in order to
determine the relevant network measures.

We have not used any of the simpler information theoretic measures available (for example the
entropy). This is because we found it necessary to consider a quantity that characterised the
difference of a specific realisation from an ensemble of realisations. The entropy of the
system as a whole increases over time, but there is no corresponding decrease for the core
population. It is easy to see why: the entropy over an ensemble of realisations is simply the
sum over individual realisations, and so one would only expect to see a decrease in entropy if
every realisation converged on a small set of link values. This by no means has to be the case,
since the system can adjust species populations to a wide range of networks. The mutual
information, on the other hand, measures how the existence of certain links within one
realisation determines the presence of other links within that same realisation, and so does
increase over time. It remains to be seen whether there is some entropic measure in Tangled
Nature (or indeed in reality) which is maximised through evolution.

One might naively think that the result for the core is simply due to the increasing stability
of the system observed in other contexts. Taken by itself this is reasonable, since it is
possible that the system stabilises over time, and that this stabilisation would positively
contribute to the mutual information of the core. However, if it were purely an artefact of the
system spending more time in a stable configuration then we would expect the whole system (that
is, both the viable core and the surrounding mutants in figure 4) to display a similar positive
trend, which is clearly not the case. Therefore we conclude that the increasing correlation of
the core, along with the increasing decorrelation of the periphery of the system, plays a
causal role in the stabilisation of the system as a whole. We postulate that these two
phenomena are linked - the system explores a greater number of possible links, which allows it
to find better adapted sets of links for the core, which in turn leads to a bigger population
and an even larger set of links to select from. While we do not claim to have proved that this
is the case, the data is strong evidence that some adaptive behaviour of this type is
occurring. In future papers we hope to probe the nature of this adaptive dynamics further.